Foreword


ISO (the International Organization for Standardization) and IEC (the
International Electrotechnical Commission) [...] work.


The main task of technical committees is to prepare International
Standards. In exceptional circumstances a technical committee may
propose the publication of a Technical Report of one of the following
types:


— type 3, when a technical committee has collected data of a 
different kind from that which is normally published as an 
International Standard ("state of the art", for example). 


Technical reports of types 1 and 2 are subject to review within three 
years of publication, to decide whether they can be transformed into 
International Standards. Technical reports of type 3 do not 
necessarily have to be reviewed until the data they provide are 
considered to be no longer valid or useful. 


ISO/IEC TR 15907, which is a Technical Report of Type 3, was 
prepared by Joint Technical Committee ISO/IEC JTC 1, Information 
technology, Subcommittee SC 2, Character sets and information 
coding. 
Introduction


Optical Character Recognition technology, OCR, was developed in 
the sixties, and some specialized OCR fonts were then designed. In 
1976 two such fonts were formally standardized by ISO, OCR-A and 
OCR-B. 


In 1993 a need was seen to extend the OCR-B set of "glyphs" (i.e. 
the printed images of characters) with a number of national letters 
used in Latin alphabets. A revision of the standard was therefore 
started, and the needs of European-origin languages identified. The 
revision was processed through three successive CDs. 


However vendor and user support for testing of the new glyphs could 
not be secured, and it was therefore decided in 1997 to halt the 
revision until such support becomes available. 


TECHNICAL REPORT © ISO/IEC 


Information technology 


ISO/IEC TR 15907:1998 (E) 


Revision of OCR-B standard (ISO 1073/II-1976) 


1 Scope 


This Technical Report documents the work 
performed on a revision of the OCR-B font standard. 
That revision has been halted, and the work item 
changed to the production of this report; see 
revision history below. 


2 References 


ISO 1073/1-1976, Alphanumeric character sets for 
optical recognition — Part I: Character set OCR-A — 
Shapes and dimensions of the printed image 


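ISO 1073/II-1976, Alphanumeric character sets for
optical recognition - Part II: Character set OCR-B -
Shapes and dimensions of the printed image


ISO 1831-1980, Printing specifications for optical
character recognition


ISO/IEC 9541-3:1994, Information technology - Font
information interchange - Part 3: Glyph shape
representation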


4 Contents of report 


The text of this report does not cover Optical 
Character Recognition (OCR) technology as such. 
Neither is the related subject of bar coding covered. 


The report however provides, in clauses 5 and 6, 
some information of a general nature on OCR 
application areas. The characteristics of the OCR-B 
font are described in clause 7, and the standard's
revision history in clause 8. General considerations [...]


[...] latest CD should form the basis for such work.


The differences between Annex A and the latest CD
are described in Annex B. Annex C contains a 
listing of glyph extensions as requested by National 
Bodies. 


Annex D is a sample definition of an OCR-B glyph 
using the shape definition syntax of ISO/IEC 9541-3 
(see clause 9). 


5 OCR application areas 


For the purpose of this report, it is suitable to 
distinguish between two differing OCR application 
areas, namely: 


— reading of information in predefined fields of 
form-type documents 


— reading of continuous-text documents 


(Although in both areas not only printed but also 
hand-written text applications exist, the latter case is 
not considered in this report.) 


OCR technology originally concentrated on the first 
of the areas, particularly within the financial sector. 
Already in the sixties a number of "Optical Readers" 
were marketed, capable of recognizing the glyphs of 
a few different fonts. 
Applications in the second area were also in small-scale
use in the sixties. The technology was greatly
accelerated by the introduction of Personal 
Computers and the development of low-cost 
scanning equipment for such machines. 


Forms-reading applications probably still constitute 
the largest volume of today's OCR processing. In 
general they place very high requirements on the 
reliability of glyph recognition and on the speed of 
processing. The hardware and software for these 
applications is generally tightly integrated, and 
marketed as packaged solutions. 


Continuous-text reading applications generally 
accept lower reliability in glyph recognition and lower 
processing speed. In this case many independent 
hardware and software vendors exist, meaning that 
most software is designed to work with a number of 
different scanners. 


For forms-type applications a specific font is usually 
specified. The recognition process therefore 
matches read images against shapes known in 
detail. In text document scanning, matching is 
instead to generic glyph shapes, and the process is 
"self-learning", sometimes referred to as Intelligent 
Optical Character Recognition, IOCR or ICR. This 
type of software usually permits operator interaction 
with the recognition process. 


The development of the OCR [...]
sixties necessitated [...]


For the [...] work was carried out
in the [...] resulting in two different
glyph [...] these two were used to
[...]
This standard is complemented by ISO 1831, which 
covers printing quality requirements for the fonts as 
well as methods for conformance testing. 


In 1994 an extension to the OCR-B font was proposed,
to better accommodate European national letters.
A revision of Part II of the standard was therefore
started within ISO/IEC JTC 1/SC 2. It was however
halted in 1997; see revision history below.


At the moment (June 1998) an interest exists within 
CEN to extend the glyph repertoire of OCR-B with a 
sign for the Euro currency. This may possibly result 
in the development of an EN (i.e. CEN standard) 
based on the OCR-B standard. 


7 OCR-B characteristics 


The OCR-B standard specifies a total of 121 glyphs.
During the work started o[...]
standard it has however b[...]


[...] three sizes. Size I is the [...]
e.g. in machine-readable [...]
European banknotes. Its [...]


[...] of most strokes are rounded. This version is
the one usually used.


The second version was intended for more capable
printing processes, and conforms to traditional 
typographical practices, with e.g. differing thickness 
of horizontal and vertical strokes. It is therefore 
termed "letterpress", and is useful as a font for 
typewriter-class applications. The ends of the 
strokes are squarely cut off, not rounded. 


The constant-strokewidth version is primarily 
intended for use with a fixed horizontal spacing 
("fixed pitch") of 2,54 mm, i.e. 10 characters per inch 
(10 cpi). The glyphs are however designed to permit 
a fixed spacing of 2,12 mm also, i.e. the 12 cpi more 
common in Europe. The letterpress version, 
although useable for fixed-pitch printing, is primarily 
intended for "proportional spacing" as normally used 
for text nowadays. 
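

The relation between the two fixed pitches quoted above and their "characters per inch" equivalents can be sketched as follows. This is an illustration only; the function names are hypothetical, and the only assumption is the exact conversion 1 in = 25,4 mm.

```python
# Sketch: convert between characters per inch (cpi) and nominal pitch in mm.
# Function names are hypothetical; 1 inch = 25.4 mm exactly.

MM_PER_INCH = 25.4


def pitch_mm(cpi: float) -> float:
    """Nominal horizontal pitch in mm for a given number of characters per inch."""
    return MM_PER_INCH / cpi


def cpi_from_pitch(pitch: float) -> float:
    """Characters per inch for a given nominal pitch in mm."""
    return MM_PER_INCH / pitch


if __name__ == "__main__":
    for cpi in (10, 12):
        # Rounded to two decimals, as the pitches are quoted in the text.
        print(cpi, "cpi ->", round(pitch_mm(cpi), 2), "mm")
```

The 12 cpi pitch is not exactly 2,12 mm; the quoted figure is the rounded value of 25,4 / 12.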


When output on medium- and high-quality printers 
the differences between the two versions can 
generally be distinguished by the human eye. For 
most OCR readers, on the other hand, the two font 
versions are in practice identical; differentiating 
between them certainly requires OCR readers with 
better than 300 "dots per inch" resolution. 
Considering inevitable printing defects and 
tolerances, the two versions could therefore be seen
as interchangeable from an OCR point of view. 


The present OCR-B standard’s glyph repertoire is 
illustrated in Annex A clause 16 at Size I, and also
magnified from that four times (which does not 
correspond to any of the three sizes; this is for 
illustration only). 


8 OCR-B revision history 


In 1993 an extension was proposed in JTC 1/SC 17 
to the standard for machine-readable passports 
ISO/IEC 7501-1, by the Turkish National Body. That 
standard does not permit any other letters than 
capital A—Z for writing names in the part of the 
passport intended for machine-reading (although 
they are permitted in the "Visual Inspection Zone" of 
the document). 


The Turkish NB considered it necessary to have the 
possibility of representing names in a correct way for 
machine-reading also. This would avoid the ambigu- 
ity inherent in most transliteration schemes. 


SC 17 at the time noted that the OCR-B standard, as
referred to in ISO/IEC 7501, defines a very small
repertoire of national letters, not completely covering
e.g. the Turkish requirements. (SC 17 has also
since concluded that, even if the OCR-B repertoire
were extended with such letters, allowing them in
passports would cause an unacceptable decrease in
recognition performance, at least [...]
equipment.)


As a consequence of the SC [...]
creation of a new part of [...]
proposed by the Turkish [...]
meeting in 1994 it was however [...]
start a revision of the OCR-B standard to extend its
glyph repertoire, and a [...]


This revision [...] number
JTC 1.02.26.


The origi[...] for a rather small
[...]


this planning. Therefore a number of CDs had to be
produced; [...], November 1995 and October
1996.


Since however necessary industry and user support
could not be secured for finalizing the revision, in
particular for testing the new glyphs proposed, SC 2
decided in July 1997 to halt the revision, and to
produce this report instead.


Due partly to CEN requests for a Euro sign in
OCR-B, the revision was again discussed in the 
February 1998 SC 2 Plenary. It was decided that the 
report should be finalized, even if some parallel work 
was started in CEN. 


9 Extension considerations 


The original OCR-B design concentrated on the
digits 0-9, capital letters A-Z, currency signs and
some punctuation marks.
[...] which is the same as for OC[...]
considered sufficient for norm[...]


[...] (like j). Neither would capital letters with
diacritical marks (like E) or small letters with 


ascenders and diacritical marks, as used in some 
Latin alphabets, stay within that frame. 


Therefore extensions of the OCR-B glyph repertoire,
as requested in the revision process, may be 
problematic for some existing OCR applications. 
Only if dimensional extensions to the recognition 
areas are possible could such extensions be 
permitted. 


Specifically in the case of Machine-Readable Travel 
Documents according to ISO/IEC 7501 the defined 
area for OCR information could accommodate such
increased size. It would however severely limit the 
tolerances needed both when printing such 
documents and when reading them. 


For the current OCR-B repertoire, all glyphs are 
specified by scale 100:1 reference drawings (see 
Annex A 14.1). In the case a repertoire extension is 
decided, glyph specification by outline definitions 
according to ISO/IEC 9541-3 should be considered 
instead. Such a sample definition is illustrated in 
Annex D. 


Annex A 


Editorially-revised text for new OCR-B standard 


Information technology 


Alphanumeric character sets for optical recognition 


Part 2: Character set OCR-B - 


Shapes and dimensions of the printed image 


1 Scope 


This International Standard defines two sets of glyph 
images, designated OCR-A and OCR-B, and 
intended primarily for use in Optical Character 
Recognition (OCR) applications; but suitable also for 
visual, i.e. human, reading. It does not relate any 
coding scheme with these images (see clause 5). 


NOTE - In the previous edition of this standard the term 
"character" was used not only in its strict sense, but also to 
mean the printed images used to represent characters 
visually. In this edition the term "glyph image" has been 
introduced for the latter meaning (except in the title of the 
standard, which has been kept unchanged). 


This standard contains information [...]
dimensions for the glyph images, [...]


[...] are however covered [...]
Standards (see clause 3).


In this part [...]
specified.


[...] are designed for combination [...]
produce composite glyph images [...]
basic images.


[...] with small letters [...]
complementing the [...]


2 Conformance 


A printing or OCR reading device is in conformance
with this standard if it can generate/recognize, for
either or both of the defined styles (see clause 6)
and in one or more of the specified sizes (see
clause 7), all or part of the specified glyph image
subsets (see clause 9).


[...] recognized. Such a specification shall take the form
of a reference [...]


[...] printing or OCR reading device must [...]
conformance to International Standard [...]

Printed images produced by an OCR-B printing 
device are in conformance with this standard if their 
nominal shapes and dimensions are in accordance 
with their respective reference drawing(s) (see 
clause 14); with the claimed conformance to 
tolerances and printing quality factors specified in 
standard ISO 1831 considered. 


3 Normative references 


The following standards and other documents 
contain provisions which, through reference in this 
text, constitute provisions of this International 
Standard. At the time of publication, the editions 
indicated were valid. All standards are subject to 
revision, and parties to agreements based on this 
International Standard are encouraged to investigate 
the possibility of applying the most recent editions of 
the standards indicated below. Members of IEC and 
ISO maintain registers of currently valid International 
Standards. 


ISO 1831-1980, Printing specifications for optical 
character recognition 


OCR-B character reference drawings and glyph
definitions; see clause 14


4 Definitions


For the purposes of this standard, the following 
definitions apply: 


4.1 character: A member of a set of elements
used for the organisation, control or representation 
of data. 


4.2 coded character set: A set of characters, 
defined by unambiguous rules that establish the 
character set and the relationship between the 
characters of the set and their coded 
representations. 


4.3 composite glyph image: An image printed on 
paper or any other medium intended for OCR 
applications, obtained by superimposing two or more 
glyph images on the same area. 


4.4 glyph: A recognizable abstract graphic symbol
which is independent of any specific design. 


4.5 glyph image: An image of a glyph, as 
obtained from a glyph representation printed on 
paper or any other medium intended for OCR 
applications. 


NOTE - The definition above of "coded character set" 
differs slightly from definitions in other ISO/IEC standards, 
and the definition of "glyph image" is more limited. The 
definition of "composite glyph image" is specific to this 
standard (at the time of its publication). 


5 Coding in OCR applications


[...] sets.
Applications based on this
standard must therefore define, through reference to
[...]
glyph images which are available for printing and/or
shall be recognized, and for each image the
corresponding character and its coding.


6 OCR-B styles 


The OCR-B glyph images are defined by this 
standard in two different styles. 
The "constant-strokewidth" style is intended primarily
for printer equipment in which the width of the 
strokes of the images is less controllable. This is for 
instance the case for some types of mechanical 
printers. 


The "letterpress" style is intended for printing 
equipment which can reproduce fine details with high 
accuracy. For aesthetic reasons, the strokewidths of
[...]


strokes; [...]
complete outlines [...]


quality characteristics.


[...] The metric and inch dimensions in this International
Standard are rounded and therefore consistent but
not exactly equal. Either system may be used but the two
should not be intermixed.


7.2 The letterpress font is specified in size I (the
smallest) only. It provides the option of a variable 
pitch in printing as is usual with letterpress. 


7.3 The constant-strokewidth font is specified in
three sizes: I, III and IV. Mechanisms using the
constant-strokewidth font will usually maintain a fixed
pitch.


NOTE - Size II which was originally in this standard has
been deleted.


In fixed-pitch printing for OCR applications, the
following minimum nominal pitches are recommended:


size I: 2,54 mm (0,100 in)
size III: 2,54 mm (0,100 in)
size IV: 3,63 mm (0,143 in)


7.4 The centrelines for the three sizes are simply
related by appropriate horizontal and vertical scale
factors. The factors for size III and size IV referred
to size I are:


for size III: Vertical 1,333; horizontal 1,086
for size IV: Vertical 1,500; horizontal 1,500


This scale relationship does not apply to the outline
shapes since nominal strokewidth is not strictly
proportional to centreline dimensions. The
strokewidths for each size are shown in the
reference drawings.


7.5 The glyph image with the greatest height above 
the base line ("A" in figure 1) in each size is DIGIT 
EIGHT. The image with the greatest total height is 
SMALL LETTER J, because of its descender. 


The centreline heights of the DIGIT EIGHT are: 


for size I: 2,40 mm (0,094 in) 
for size III: 3,20 mm (0,126 in)
for size IV: 3,60 mm (0,142 in) 


7.6 The widest glyph image in each size (except for 
the alternative SMALL LETTER M) is DIGIT ZERO. 
Its centreline widths are: 


for size I: 1,40 mm (0,055 in) 
for size III: 1,52 mm (0,060 in)
for size IV: 2,10 mm (0,083 in) 
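

The centreline dimensions quoted in 7.5 and 7.6 can be reproduced from the size I values and the scale factors given in 7.4. The following is a minimal sketch of that arithmetic; the names are hypothetical, and values are rounded to the precision used in the text.

```python
# Sketch: derive the size III and size IV centreline dimensions of
# DIGIT EIGHT (height) and DIGIT ZERO (width) from the size I values,
# using the scale factors of 7.4. Names are hypothetical.

MM_PER_INCH = 25.4

# Size I centreline dimensions in mm (7.5 and 7.6).
SIZE_I = {"height": 2.40, "width": 1.40}

# Scale factors referred to size I (7.4).
SCALE = {
    "III": {"vertical": 1.333, "horizontal": 1.086},
    "IV": {"vertical": 1.500, "horizontal": 1.500},
}


def scaled_centreline(size: str) -> tuple[float, float]:
    """Return (height, width) in mm for the given size, rounded to 0.01 mm."""
    factors = SCALE[size]
    height = round(SIZE_I["height"] * factors["vertical"], 2)
    width = round(SIZE_I["width"] * factors["horizontal"], 2)
    return height, width


def mm_to_in(mm: float) -> float:
    """Convert mm to inches, rounded to 0.001 in as in the text."""
    return round(mm / MM_PER_INCH, 3)


if __name__ == "__main__":
    for size in ("III", "IV"):
        height, width = scaled_centreline(size)
        print(size, height, mm_to_in(height), width, mm_to_in(width))
```

The results agree with the tabulated values, which illustrates the note in clause 7 that the metric and inch figures are each rounded and therefore consistent but not exactly equal.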
8 Typical dimensions of the nominal printed image


Typical dimensions for the nominal printed image of
the letterpress font in size I are given in table 1.
These dimensions are the heights above and below
the horizontal base line of digits, capital and small
letters, ascenders and descenders (see figure 1).


The shapes and dimensions of the constant-strokewidth
glyph images are similar, except that the stroke ends
are rounded.


NOTE - [...]
9 OCR-B GLYPH IMAGE SET


The full set contains 116 glyph images and a definition for SPACE (see clause 13). Four subsets are defined:


9.1 Subset 1: Minimal alphanumeric subset


This subset applies to sizes I, III and IV in constant-strokewidth font and to size I in letterpress font. It contains
21 glyph images and SPACE:


0123456789
CENSTXZ 
< + > | SPACE 


9.2 Subset 2: Basic alphanumeric subset 


This subset applies to sizes I and IV in constant-strokewidth font and to size I in letterpress font. It contains
25 glyph images in addition to subset 1, i.e. a total of 46 glyph images and SPACE:


0123456789
ABCDEFGHIJKLM
NOPQRSTUVWXYZ
<+>*- | SPACE
9.3 Subset 3: Extended alphanumeric subset

This subset applies to sizes I and IV in constant-strokewidth font and to size I in letterpress font.


It contains 50 glyph images in addition to subset 2, i.e. 96 glyphs in all (and SPACE); in particular those images
corresponding to the characters listed in ISO/IEC 646 as unique, alternative, and International Reference
Version.


!"#£¤$%&'()*+,-./
0123456789:;<=>?
@ABCDEFGHIJKLMNO
PQRSTUVWXYZ[\]^_
`abcdefghijklmno
pqrstuvwxyz{|}~
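
The subset sizes quoted in 9.1 to 9.3 can be cross-checked with a few lines of code. This is only an illustrative sketch: the names are hypothetical, and the subset 1 repertoire is taken to be the one listed in 9.1 (the ten digits, the capital letters C E N S T X Z, and the four signs < + > |), with SPACE counted separately.

```python
# Sketch: consistency check of the subset sizes quoted in 9.1 to 9.3.
# Names are hypothetical; the counts come from the text of clause 9.

# Subset 1 as listed in 9.1 (SPACE is counted separately).
SUBSET_1 = set("0123456789" "CENSTXZ" "<+>|")

ADDED_BY_SUBSET_2 = 25  # glyph images added to subset 1 (9.2)
ADDED_BY_SUBSET_3 = 50  # glyph images added to subset 2 (9.3)


def cumulative_sizes() -> tuple[int, int, int]:
    """Return the running totals of glyph images for subsets 1, 2 and 3."""
    size_1 = len(SUBSET_1)
    size_2 = size_1 + ADDED_BY_SUBSET_2
    size_3 = size_2 + ADDED_BY_SUBSET_3
    return size_1, size_2, size_3


if __name__ == "__main__":
    print(cumulative_sizes())
```

The totals match the 21, 46 and 96 glyph images stated for the three subsets.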

9.4 Subset 4: Options subset

This subset applies to sizes I and IV in constant-strokewidth font, and to size I in letterpress font. It contains
8 capital national letters, 5 small national letters, [...] diacritical marks and 3 further glyph images.

Images from this subset shall be used in conjunction with subset 3. A printing or OCR reading device may
generate/recognize any of the images in this subset. The images generated/recognized by the device shall be [...]


10 Index table


10.1 All glyph images are available in size I as
constant-strokewidth font and as letterpress font.


Only the images of the minimal alphanumeric subset
(subset 1) are available in size III as
constant-strokewidth font.


All images are available in size IV as
constant-strokewidth font, with the exception of VERTICAL
LINE.


10.2 In the following table each image is given with 
the number of its reference drawing(s) and the 
subset(s) in which it is comprised. 


The drawings are identified as follows : 


L: for letterpress font, size I
C: for the constant-strokewidth font, size I
III: for the constant-strokewidth font, size III


10.3 As stated in 14.6, the shapes for size IV are
derived from those of size I for the constant-strokewidth
font (designated by C).


10.4 Application advice is given in the column 
"Remarks". 


It is recommended that prospective users of this [...]


Table 2 - OCR-B glyph image set

No.  Drawing(s)  Name
1    L, C, III   DIGIT ONE
2    L, C, III   DIGIT TWO
3    L, C, III   DIGIT THREE
4    L, C, III   DIGIT FOUR
5    L, C, III   DIGIT FIVE
6    L, C, III   DIGIT SIX
                 DIGIT SEVEN
[...]
                 LATIN CAPITAL LETTER E
                 LATIN CAPITAL LETTER F
                 LATIN CAPITAL LETTER G
                 LATIN CAPITAL LETTER H
                 LATIN CAPITAL LETTER I
                 LATIN CAPITAL LETTER J
                 LATIN CAPITAL LETTER K
[...]
                 LATIN CAPITAL LETTER S
                 LATIN CAPITAL LETTER T
                 LATIN CAPITAL LETTER U
                 LATIN CAPITAL LETTER V
                 LATIN CAPITAL LETTER W
                 LATIN CAPITAL LETTER X
                 LATIN CAPITAL LETTER Y
                 LATIN CAPITAL LETTER Z
[...]
                 LATIN SMALL LETTER [...]  Smaller strokewidth; see clause 14
                 LATIN SMALL LETTER G  Smaller strokewidth; see clause 14
                 LATIN SMALL LETTER H  Smaller strokewidth; see clause 14
                 LATIN SMALL LETTER I  Smaller strokewidth; see clause 14
                 LATIN SMALL LETTER J  Smaller strokewidth; see clause 14
                 LATIN SMALL LETTER K  Smaller strokewidth; see clause 14
                 LATIN SMALL LETTER L  Smaller strokewidth; see clause 14
                 LATIN SMALL LETTER M  Smaller strokewidth; see clause 14
                 LATIN SMALL LETTER N  Smaller strokewidth; see clause 14
                 LATIN SMALL LETTER O  Smaller strokewidth; see clause 14
                 LATIN SMALL LETTER P  Smaller strokewidth; see clause 14
[...]
                 LATIN SMALL LETTER V  Smaller strokewidth; see clause 14
                 LATIN SMALL LETTER W  Smaller strokewidth; see clause 14
                 LATIN SMALL LETTER X  Smaller strokewidth; see clause 14
[...]
61
                 LATIN SMALL LETTER Z
                 ASTERISK
                 PLUS SIGN
                 HYPHEN-MINUS
                 EQUALS SIGN
                 COMMA  Two vertical locations are specified, one of which
                        projects below the base line for capital letters;
                        see 14.5 and 14.8
                 SEMICOLON  Two vertical locations are specified, one of which
                            projects below the base line for capital letters;
                            see 14.5 and 14.8
                 QUOTATION MARK
                 APOSTROPHE
                 LOW LINE  Shall be used as a stand-alone character only,
                           and shall not be printed under another character;
                           see clause 12
                 QUESTION MARK
                 EXCLAMATION MARK
                 LEFT PARENTHESIS
                 RIGHT PARENTHESIS
                 LESS-THAN SIGN
                 GREATER-THAN SIGN
                 [...]  Smaller strokewidth; see clause 14
                 UP ARROWHEAD
                 CURRENCY SIGN
89   L, C        POUND SIGN
90
91   L, C, III   VERTICAL LINE
                 REVERSE SOLIDUS
                 LATIN CAPITAL LETTER A WITH DIAERESIS
                 LATIN CAPITAL LETTER A WITH RING ABOVE
                 LATIN CAPITAL LETTER [...] WITH DIAERESIS
                 [...] WITH STROKE
                 LATIN CAPITAL LIGATURE IJ
                 LATIN CAPITAL LETTER N WITH TILDE
                 LATIN SMALL LETTER A WITH RING ABOVE  Smaller strokewidth; see clause 14
                 LATIN SMALL LETTER AE  Shall also be used for LATIN SMALL LIGATURE AE.
                                        Smaller strokewidth; see clause 14
                 LATIN SMALL LETTER O WITH STROKE  Smaller strokewidth; see clause 14
                 LATIN SMALL LIGATURE IJ  Smaller strokewidth; see clause 14
                 LATIN SMALL LETTER SHARP S (German)  Smaller strokewidth; see clause 14
                 DIAERESIS  For use see clauses 11 and 14
                 ACUTE ACCENT
                 GRAVE ACCENT
                 CIRCUMFLEX ACCENT  For use see clauses 11 and 14
                 CEDILLA
                 [...]  May be used in variable-pitch printing as a
                        substitute for Ref. 49
                 [...]  The glyph previously defined in this position
                        (CONTINUOUS UNDERLINE) has been deleted
                 [...]  SPACE is non-printing. For definition, see
                        clause 13. Not all OCR readers will necessarily
                        recognize SPACE
                 SECTION SIGN
119


NOTE - The glyphs previously defined with reference numbers 120 (CHARACTER ERASE) and 121 (GROUP ERASE) have
been deleted.
11 Use of diacritical marks


11.1 A number of diacritical marks are provided
which have been designed and positioned in such a
way that they can be combined with small letters to
create national letters. These marks are:

DIAERESIS
ACUTE ACCENT
GRAVE ACCENT
CIRCUMFLEX ACCENT
TILDE
CEDILLA


These diacritical marks may also be used free-standing.


Figure 2 - Examples of composite glyph images


11.2 The method for printing a composite glyph 
image, as well as the principle for deciding the 
coding of a recognized composite image, will be 
application-dependent, and is outside the scope of 
this standard. In particular, the naming of the 
diacritical marks in this standard, although 
corresponding to the names for free-standing marks 
in ISO/IEC 10646-1:1993, does not imply any 
specific method for combination. 


NOTES 


1 For printing applications, a letter could be followed by
a backspace operation followed by a diacritical mark; or the
opposite sequence could be output; or the diacritical mark
could be made non-spacing and followed by the letter; or
the combined letter could be composed in th[...]
a single code assigned to the correspond[...]
a "composite graphic character" [...]
"accented letter" (ISO/IEC 6937: [...]
sequence" (ISO/IEC 10646-[...]


2 [...] letter [...] a diacritical mark
[...] specific [...] in the scripts of some
languages the [...] is a separate letter of the [...]


11.3 [...]ive position of the diacritical mark
and of th[...]ained by superimposing the
horizontal and vertical axes of the two glyph images
concerned. Exam[...]


[...] SMALL LETTER G and SMALL LETTER Y shall
not be combined with CEDILLA;


No letter shall be combined with multiple
diacritical marks.


Apart from these rules, this International Standard
imposes no restrictions on letter-mark combinations.
Such restrictions should therefore be specified, as
necessary, for any particular printing and/or OCR
application based on OCR-B images; or restrictions
could be implied by definition of a character set for
the application, through reference to other
International Standards or otherwise.


The validity of any specific combination of diacritical
mark and letter, as well as of any free-standing
diacritical mark, is application-dependent, and
outside the scope of this standard.


NOTE - Not all letter-mark combinations permitted by the
rules above will be valid national letters.
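

As an illustration of how an application might screen letter-mark combinations, the sketch below checks only the two rules that are legible in this clause: no CEDILLA on SMALL LETTER G or SMALL LETTER Y, and no more than one diacritical mark per letter. It is an assumption-laden example, not part of the standard: the function and constant names are hypothetical, any further rules of this clause are not covered, and passing the check does not make a combination a valid national letter.

```python
# Sketch: screen a letter-mark combination against the two rules that
# are legible in clause 11. All names are hypothetical.

# Diacritical marks listed in 11.1.
DIACRITICAL_MARKS = {
    "DIAERESIS",
    "ACUTE ACCENT",
    "GRAVE ACCENT",
    "CIRCUMFLEX ACCENT",
    "TILDE",
    "CEDILLA",
}

# Small letters that shall not be combined with CEDILLA.
NO_CEDILLA = {"g", "y"}


def violates_legible_rules(letter: str, marks: list[str]) -> bool:
    """Return True if the combination breaks one of the two rules checked here."""
    unknown = [mark for mark in marks if mark not in DIACRITICAL_MARKS]
    if unknown:
        raise ValueError(f"not a diacritical mark listed in 11.1: {unknown}")
    if len(marks) > 1:
        # No letter shall be combined with multiple diacritical marks.
        return True
    if "CEDILLA" in marks and letter in NO_CEDILLA:
        # SMALL LETTER G and SMALL LETTER Y shall not take CEDILLA.
        return True
    return False


if __name__ == "__main__":
    print(violates_legible_rules("c", ["CEDILLA"]))
    print(violates_legible_rules("g", ["CEDILLA"]))
    print(violates_legible_rules("e", ["ACUTE ACCENT", "DIAERESIS"]))
```

Whether a combination that passes this screen is acceptable remains application-dependent, as stated above.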
12 Use of the low line glyph


The glyph image LOW LINE shall be used in OCR 
applications free-standing only, and shall not be 
combined with (i.e. printed under) another image. 


Figure 3 - Example of use of LOW LINE


13 SPACE 


The SPACE is an intentionally blank position in a 
line of printing. With constant-pitch printing, its 
nominal width is equal to the printing pitch (for 
example, 2,54 mm if the glyph images are printed 10 
per 25,4 mm). With variable-pitch printing, its 
nominal width is equal to the largest glyph image 
width available. 


14 Glyph image shape definition 
14.1 Reference drawings


The shapes and dimensions of the OCR-B glyphs for
both the letterpress and the constant-strokewidth
fonts are specified by drawings for size I and III.


The characters are drawn at scale 100:1 on a 2 mm
square grid [...]


The total grid measure[...]


Original drawings on a stable base at 100:1 scale
with the 280 mm x 380 mm grid exist in the following
sets:
sets: 
OD1 Letterpress font, size I.

OD2 Letterpress font, size I with the grid
removed over approximately 2 mm around
the glyph image outline. This set is
particularly suitable for photographic
reduction.

OD3 Constant-strokewidth font, size I.

OD4 Constant-strokewidth font, size [...]


NOTE - Duplicates of the drawings could previously be
ordered from ECMA and from the US National Bureau of
Standards (later NIST). The drawings are at present (June
[...]
Committee. Future handling o[...] this
[...] consideration.


14.4 Constant-strokewidth font, size I


14.4.1 The nominal printed image of each glyph
image is defined by its centreline and by its nominal 
strokewidth. The nominal strokewidth is: 


0,35 mm (0,014 in) for most of the images 


0,31 mm (0,012 in) for all small letters and the 
three images #, % and @. 


The centreline and preferred line endings and 
corners are given in drawings marked "C". Pointers 
establish the vertical position (base line) and the 
orientation. Another pointer establishes the 
horizontal position for fixed-pitch printing. 


The reference drawings for the COMMA and the 
SEMICOLON contain pointers to indicate two 
alternative vertical positions (base lines). Either 
position can be selected for the two glyph images for 
a specific font implementation, depending on the 
intended use of the font. 


14.4.2 A special effort should be made in type 
design and manufacturing to arrive at actual print 
that conforms as closely as possible to the given line 
endings and corners. This is especially important for 
the square corners of capital letters B and D. 
14.4.3 A pointer is provided to produce the most
aesthetic spacing of glyph images in a line of 
printing. However, on printers having a significant 
horizontal spacing tolerance it is recommended to 
use the geometric centreline of the image instead of 
the line defined by the pointer where necessary to 
achieve an acceptable image separation. 


14.5 Constant-strokewidth font, size Ill 


The nominal printed image of each glyph image is 
given by its centreline and by its nominal 
strokewidth. The nominal strokewidth is 0,38 mm 
(0,015 in). The 20 reference drawings for 0 1 2 3 4
5 6 7 8 9 C E N S T X Z < + > are marked "III" and
include pointers. Subclauses 14.4.2 and 14.4.3 also
apply.


14.6 Constant-strokewidth font, size IV 


The nominal printed image of each glyph image is 
given by its centreline and by its nominal 
strokewidth. The size IV centreline is derived from 
the corresponding size I centreline (see 14.4 and
reference drawings marked "C") by a linear 
magnification of 1,5. For example, an image 
centreline width of 2,40 mm becomes 
1,5 x 2,40 mm = 3,60 mm in size IV, and so on. 


The nominal strokewidth is: 


0,50 mm (0,020 in) for most of the glyph images. 


0,44 mm (0,017 in) for all small letters and the 
three images £, % and @. 


Preferred line endings and co 
accurately arrived at by a 1, 


ed accuracy from drawings 
establish the vertical position 
(base line), the orientation and the body width. 
A pointer also establishes the horizontal position for 
fixed-pitch printing. 


The reference drawings for the COMMA and the 
SEMICOLON contain pointers to indicate two 
alternative vertical positions (base lines). Either 
position can be selected for the two glyph images for 
a specific implementation, depending on the 
intended use of the font. 


The glyph images of the letterpress font are 
designed with minor strokewidth variations. 
However, strokewidths are always close to the 
nominal value of 0,35 mm (0,014 in) for digits and 
capital letters, and of 0,31 mm (0,012 in) for small 
letters and the three images #, % and @. 

15 Printing the letterpress and constant-strokewidth fonts 


In order to print the letterpress font and to achieve 


at the discretio 
printing equipment to design their type so that the 


Care should be taken that the printed image strokes 
are symmetrically distributed around the centrelines 
as specified in this document. 
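
As an illustration of a stroke distributed symmetrically around its centreline, the following minimal sketch computes the two edges of a straight stroke. The function name and the sample segment are hypothetical; the sketch covers straight segments only and says nothing about line endings or corners.

```python
import math


def stroke_edges(p0, p1, strokewidth_mm):
    """Return the two edges of a straight stroke centred on segment p0-p1.

    Each edge lies half a strokewidth from the centreline, on opposite sides.
    """
    (x0, y0), (x1, y1) = p0, p1
    length = math.hypot(x1 - x0, y1 - y0)
    if length == 0:
        raise ValueError("centreline segment has zero length")
    # Unit normal to the centreline direction.
    nx, ny = -(y1 - y0) / length, (x1 - x0) / length
    half = strokewidth_mm / 2.0
    left = ((x0 + nx * half, y0 + ny * half), (x1 + nx * half, y1 + ny * half))
    right = ((x0 - nx * half, y0 - ny * half), (x1 - nx * half, y1 - ny * half))
    return left, right


# Hypothetical vertical stem, 2 mm long, with a strokewidth of 0,35 mm.
left, right = stroke_edges((0.0, 0.0), (0.0, 2.0), 0.35)
print(left, right)
```

For this vertical stem the two edges lie at equal distances on either side of the centreline, which is the symmetry the paragraph above asks for.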


16 Illustration of OCR-B 


Figure 4 shows the complete OCR-B set in size I at 
scales 4:1 and 1:1. 


The drawings reproduced on the following pages 
show: 


— DIGIT ONE, LATIN CAPITAL LETTER E, 
SECTION SIGN and YEN SIGN in size I as 
letterpress font and as constant-strokewidth font; 


— DIGIT ONE and LATIN CAPITAL LETTER E in 
size III as constant-strokewidth font. 


These reproductions of the original drawings are at 
approximately 70% of full scale. 


Text on this page to be included as Annex A (informative) to standard 


Notes on the implementation of OCR-B 


The design of the OCR-B font is based on 
fundamental aesthetic principles which, as far as 
feasible, correspond to the criteria emerging from 
the long development of our classic typography. 
One of the essential principles prescribes that in a 
letter design all vertical parts must be heavier than 
the horizontal parts. This is also true for so-called 
sans serif characters, that is for a design which at 
first sight has a thread-like appearance. This is 
precisely the case for OCR-B. 


The OCR-B can be implemented in two clearly 
different forms. It can be used as a font with 
constant-strokewidth as well as a letterpress font. 
Type generation can be based on either 
implementation. 


For printing devices like high-speed mechanical 
printers and similar machines, the centreline is the 
skeleton along which a stroke of prescribed width is 
placed. For engraving, it is recommended to use a 
tool the diameter of which is equal to the 
strokewidth. The resulting engraving is completely 




letterpress font, like its varying 
ever recommended that other 


sharp angles at the end of the 
strokes, are implemented as far as possible. The 
less mechanical and bear more 
e forms of traditional typography to 


Text on this page to be included as Annex B (informative) to standard 


Main differences between ISO 1073/II-1976 and this first edition 
of this part 2 of ISO/IEC 1073 


(To be prepared when the standard text is finalized; see also Annex B on the following page) 


Annex B 


Main extensions in ISO/IEC CD 1073-2.3 
as compared to Annex A (pages 5-25 of this document) 


B.1 Letter combinations with diacritical marks are 
allowed not only for small letters but also for capital 
letters, thereby providing glyph representations for 
all Latin letters in the present parts of ISO/IEC 8859. 
This is described in clause 1, and combination rules 
are given in clauses 11 and 14. 


B.2 The combination with capital letters increases 
the maximum height of OCR-B glyphs. This is 
specified in clause 8 (figure 1 and table 1). 


B.3 The OCR-B repertoire is extended with 40 new 
glyphs, defined in letterpress style only. Information 
relating to this extension is given in clauses 6, 7, 9, 
10, 11 and 16 (figure 4). The requested new glyphs 
are listed in Annex C. 


B.4 The extension is intended to cover also the 
Greek capital alphabet, containing the ten Greek 


use glyphs" 
rules are given in clause 


B.5 Addition of new glyphs and the new 
combination rules for capital letters means that the 
letters AANOUA in the original 
alternative combined 


reflected in clause 3, and further information given 
in clauses 6, 8, and 14. 


Annex C 


OCR-B glyph repertoire extensions requested by National Bodies 


In the processing of the OCR-B revision the following 
new glyphs have been requested (named here as 
the corresponding characters according to ISO/IEC 
10646-1:1993): 


Latin precomposed and other national letters: 


LATIN CAPITAL LETTER A WITH OGONEK 
LATIN CAPITAL LETTER D WITH STROKE 
LATIN CAPITAL LETTER ETH 
LATIN CAPITAL LETTER E WITH OGONEK 
LATIN CAPITAL LETTER H WITH STROKE 
LATIN CAPITAL LETTER L WITH STROKE 
LATIN CAPITAL LETTER T WITH STROKE 
LATIN CAPITAL LETTER THORN 
LATIN CAPITAL LETTER ENG 
LATIN CAPITAL LIGATURE OE 
LATIN SMALL LETTER A WITH OGONEK 
LATIN SMALL LETTER D WITH STROKE 
LATIN SMALL LETTER G WITH CEDILLA 
LATIN SMALL LETTER H WITH STROKE 
LATIN SMALL LETTER DOTLESS I 
LATIN SMALL LETTER J WITH CIRCUMFLEX ACCENT 
LATIN SMALL LETTER L WITH STROKE 
LATIN SMALL LETTER R WITH CEDILLA 
LATIN SMALL LETTER T WITH STROKE 
LATIN SMALL LETTER U WITH OGONEK 
LATIN SMALL LETTER ETH 
LATIN SMALL LETTER THORN 
LATIN SMALL LETTER ENG 
LATIN SMALL LIGATURE OE 


NOTE - It is intended to unify the glyphs for CAPITAL 
LETTER D WITH STROKE and 


GREEK CAPITAL LETTER SIGMA 
GREEK CAPITAL LETTER PHI 
GREEK CAPITAL LETTER PSI 
GREEK CAPITAL LETTER OMEGA 


Diacritical marks: 


MACRON 
BREVE 
CARON 
DOUBLE ACUTE ACCENT 
DOT ABOVE 
RING ABOVE 
OGONEK 


NOTES 


1 Additional single and multiple diacritical marks specific 
to Vietnamese Bn ing have been requested in ballot 


e of OCR-B design difficulties, and 
Vietnamese part of ISO/IEC 8859 


COMMA BELOW has also been requested. 
d impractical from an OCR design point 
of view. A note about possible use of the glyph for the 
K, similar to the note in ISO/IEC 8859-2, should be 


Other characters: 


The EURO SIGN has been requested by CEN/TC304. 


Annex D 
Sample glyph definition according to ISO/IEC 9541-3 


DEFINITION OF OCR-B CHARACTER "SMALL LIGATURE OE" (ISO/IEC 9541-3 SYNTAX) 


84 135 594 0 rpe 
0 78 vstem 174 78 vstem 348 78 vstem -5 69 hstem 207 69 hstem 397 69 hstem 


0 hmoveto 
-80 28 -60 90 vhcurveto 55 0 30 30 10 25 rrcurveto 
10 -25 30 -30 55 0 rrcurveto 90 31 55 81 hvcurveto -78 hlineto 
-40 -12 -27 -35 vhcurveto -35 -17 20 41 hvcurveto 82 vlineto 174 hlineto 
118 vlineto 81 -28 60 -90 vhcurveto -55 0 -30 -24 -10 -27 rrcurveto 
-10 27 -30 24 -55 0 rrcurveto -90 -28 -60 -81 hvcurveto 
closepath 


78 202 rmoveto 
35 18 25 30 vhcurveto 30 18 -25 -35 hvcurveto -213 vlineto 
-35 -18 -25 -30 vhcurveto -30 -18 25 35 hvcurveto 
closepath 


174 0 rmoveto 
35 18 25 30 vhcurveto 30 18 -25 -35 hvcurveto -61 vlineto 
closepath 


endglyph 


THE DEFINITION ABOVE WILL PRODUCE THE FOLLOWING GLYPH: 


[Figure: printed image of the SMALL LIGATURE OE glyph] 
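
To make the relative path operators in the definition easier to follow, the sketch below traces the end point of each path segment. It assumes that the operators behave like the identically named Adobe Type 1 charstring operators (all arguments relative; `vhcurveto` takes dy1 dx2 dy2 dx3 and `hvcurveto` takes dx1 dx2 dy2 dy3); this reading is an assumption and is not stated in this document. The function name is hypothetical, and the third subpath is traced from the origin purely for illustration, whereas in the full glyph its `rmoveto` is relative to the preceding point.

```python
def trace_path(tokens, start=(0, 0)):
    """Return the end point of each path segment, in glyph units.

    Curves are not rendered: only the final point of each operator is kept.
    """
    x, y = start
    points = []
    args = []
    for tok in tokens:
        try:
            args.append(int(tok))
            continue
        except ValueError:
            pass  # not a number, so it is treated as an operator
        if tok == "rmoveto":
            dx, dy = args
        elif tok in ("hmoveto", "hlineto"):
            dx, dy = args[0], 0
        elif tok == "vlineto":
            dx, dy = 0, args[0]
        elif tok == "rrcurveto":  # dx1 dy1 dx2 dy2 dx3 dy3
            dx, dy = args[0] + args[2] + args[4], args[1] + args[3] + args[5]
        elif tok == "vhcurveto":  # dy1 dx2 dy2 dx3 (assumed Type 1 reading)
            dx, dy = args[1] + args[3], args[0] + args[2]
        elif tok == "hvcurveto":  # dx1 dx2 dy2 dy3 (assumed Type 1 reading)
            dx, dy = args[0] + args[1], args[2] + args[3]
        elif tok == "closepath":
            args = []
            continue
        else:
            raise ValueError(f"unsupported operator: {tok}")
        x, y = x + dx, y + dy
        points.append((x, y))
        args = []
    return points


# Third subpath of the definition above, traced from a hypothetical origin.
third = ("174 0 rmoveto 35 18 25 30 vhcurveto "
         "30 18 -25 -35 hvcurveto -61 vlineto closepath")
points = trace_path(third.split())
print(points)
```

Hint operators such as `vstem` and `hstem` are deliberately left out of the sketch, since they do not move the current point.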


